The Information Commons
Introduction to HTML

8.1 URLs for HTTP Servers

Since most HTML is served from HTTP (HyperText Transfer Protocol) servers, this is the kind of URL you are most likely to see. Consider the following examples. The first points to the CERN/W3C documents describing URLs, while the remaining two point to the home page of an older version of my own server (this address no longer exists):
http://www.w3.org/hypertext/WWW/Addressing/URL/Overview.html
http://www.hprc.utoronto.ca:3232/home.html
http://www.hprc.utoronto.ca:3232/

What does this mean? The first part, http:, means that the documents are served by an HTTP server. The double slash (//) means that the next part is the name of the server. This can have two parts: the Internet address of the server (essential) and the port number the server listens on (optional). In the first example (www.w3.org) the port number is not specified, so the browser assumes the default port for HTTP servers (port 80). In the second example, the URL tells the browser that the HTTP server is at port 3232. The port is specified after the server name, separated from it by a colon.
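The breakdown above can be sketched in Python using the standard urllib.parse module. This is only an illustration of how a client might pull out the server name and port; the function name server_and_port and the constant DEFAULT_HTTP_PORT are made up for this example.

```python
from urllib.parse import urlsplit

# Default port assumed for HTTP servers when the URL gives none.
DEFAULT_HTTP_PORT = 80


def server_and_port(url):
    """Return (server name, port) for an http URL (hypothetical helper)."""
    parts = urlsplit(url)
    # parts.port is None when no ":port" follows the server name.
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    return parts.hostname, port


print(server_and_port("http://www.w3.org/hypertext/WWW/Addressing/URL/Overview.html"))
# ('www.w3.org', 80)
print(server_and_port("http://www.hprc.utoronto.ca:3232/home.html"))
# ('www.hprc.utoronto.ca', 3232)
```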

The final element is the file or resource being requested, which is separated from the address and port number pair by a slash (/). The resource is specified by a path relative to the root directory of the server. Thus the URL overview document at CERN is found at .../URL/Overview.html with respect to the HTTP server root.

A file or resource specification beginning with /cgi-bin/ is usually special: in the case of my server, the cgi-bin directory is treated as a special directory reserved for programs/scripts that can be executed by the server. This is discussed in section 8.1.1 below.

If the file name is left out, the server tries to send you a default directory file. Usually this is "index.html", but this can be changed (or turned off) in the server configuration files. You should always include the trailing slash when referencing a directory (for example /directory/), as otherwise the server will think you are requesting a file named directory rather than information about the directory.
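As a rough sketch of the rule just described, the following Python function shows how a server might decide which resource a URL refers to. The function name requested_resource and its default_file parameter are invented for illustration, and real servers differ in how (and whether) they apply a default directory file.

```python
from urllib.parse import urlsplit


def requested_resource(url, default_file="index.html"):
    """Return the server-relative path a server might look up (a sketch)."""
    # An empty path (no slash after the server name) is treated as the root.
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # Directory reference: fall back to the default directory file.
        return path + default_file
    # No trailing slash: treated as a request for a file of that name.
    return path


print(requested_resource("http://www.hprc.utoronto.ca:3232/"))
# /index.html
print(requested_resource("http://www.hprc.utoronto.ca:3232/home.html"))
# /home.html
```

Note that a path such as /directory (without the trailing slash) is returned unchanged, which mirrors the server treating it as a file name.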

8.1.1 Passing Parameters to the Server

The HTTP protocol supports the passing of arguments to the server. The general format is to append the arguments to the URL, separated from the URL by a question mark (?). The reason for this notation is simple: most requests of this type are requests to search a database, and the passed arguments are the search parameters.

The general form is as follows:

http://some.site.edu/cgi-bin/foo?arg1+arg2+arg3

What does this mean? There are two things to note:
cgi-bin
The cgi-bin directory is a special location known to the server, containing executable programs or scripts. The reason is obvious: you have to pass arguments to something that can act on those arguments, implying a program or script. The cgi-bin directory contains programs/scripts that interface with the WWW: a URL can access and pass arguments to programs/scripts in this directory, and these programs/scripts can in turn act on the arguments and return information, documents, etc. to the browser.
passed arguments
Arguments are appended to the URL, separated from it by a question mark (?). You can also send more than one argument, with the arguments separated by plus signs (+). Thus in the above example the program/script foo is sent three arguments: arg1, arg2 and arg3.
For more information see the CERN/W3C URL documentation.
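The argument convention described in this section can be illustrated with a short Python sketch that separates the program/script path from the arguments that follow the question mark. The function name script_and_arguments is hypothetical, and the sketch only handles the simple plus-separated form shown here.

```python
from urllib.parse import urlsplit


def script_and_arguments(url):
    """Split a URL into (resource path, list of arguments); a sketch only."""
    parts = urlsplit(url)
    # parts.query holds everything after the question mark (?).
    # Multiple arguments are separated by plus signs (+).
    arguments = parts.query.split("+") if parts.query else []
    return parts.path, arguments


print(script_and_arguments("http://some.site.edu/cgi-bin/foo?arg1+arg2+arg3"))
# ('/cgi-bin/foo', ['arg1', 'arg2', 'arg3'])
```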

8.1.2 Personal HTML directories

Users can have HTML documents in their own home directories, distinct from the server hierarchy. The procedure for doing this depends to some degree on the server. In general the user needs to create a special file, placed in their home directory, that specifies where the personal 'root' HTML directory is. You access these files with the 'effective' path ~your_login_name/path/file. Again, this is a server-specific feature, and not all servers support it or have it turned on. Ask your server manager for details about your local implementation.

© Ian Graham 1994-1995 Page Last Updated: 4 December 1995